
    Sangam, A Confluence

    Sangam, Sanskrit for "confluence," is a novel set across three storylines, all connected by a single ghazal poem whose evolution spans the lives and times of three men. In medieval India, the Sufi poet Amir Khusrow arrives at the ruins of an ancient Hindu temple, seeking inspiration and revival for his work; centuries later, at the turn of India's independence from Britain, young lawyer Jayant finds his idealism tested against the nation's messy beginnings; in the present day, a young Indian-American disc jockey navigates the nightclub scene, hoping to become a modern music star. The novel is meant to mimic how music is sampled, re-appropriated, and remixed over time. In the same way a DJ matches songs by beat and melody, the themes and emotional arcs of each man's storyline are meant to echo one another and resonate as a whole.

    The trade-offs of model size in large recommendation models: A 10000× compressed criteo-tb DLRM model (100 GB of parameters to a mere 10 MB)

    Embedding tables dominate industrial-scale recommendation model sizes, using up to terabytes of memory. The largest publicly available MLPerf machine learning benchmark on recommendation data is a Deep Learning Recommendation Model (DLRM) trained on a terabyte of click-through data; it contains 100 GB of embedding memory (25+ billion parameters). Due to their sheer size and the associated volume of data, DLRMs are difficult to train and deploy for inference, and their large embedding tables create memory bottlenecks. This paper analyzes and extensively evaluates a generic parameter sharing setup (PSS) for compressing DLRM models. We show theoretical upper bounds on the learnable memory required to achieve a (1 ± ε) approximation to the embedding table; our bounds indicate that exponentially fewer parameters suffice for good accuracy. To this end, we demonstrate a PSS DLRM reaching 10000× compression on criteo-tb without losing quality. Such compression, however, comes with a caveat: it requires 4.5× more iterations to reach the same saturation quality. The paper argues that this tradeoff needs more investigation, as it might be significantly favorable. Leveraging the small size of the compressed model, we show a 4.3× improvement in training latency, leading to similar overall training times. Thus, in the tradeoff between the system advantage of a small DLRM model and slower convergence, we show that the scales are tipped towards the smaller DLRM model, which gives faster inference, easier deployment, and similar training times.
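    Since the PSS idea above hinges on serving a huge logical embedding table from a much smaller pool of learnable weights, a minimal sketch may help make it concrete. The class below is an illustrative, simplified construction (the hash function, sign trick, and sizes are assumptions, not the paper's exact setup): every (category, column) slot is hashed on the fly into one small shared parameter pool, so the learnable memory is independent of the number of categories.

```python
import torch
import torch.nn as nn

class HashedSharedEmbedding(nn.Module):
    """Illustrative parameter-sharing embedding: all rows read from one small pool."""

    def __init__(self, dim: int, pool_size: int):
        super().__init__()
        self.dim = dim
        self.pool_size = pool_size
        # The only learnable memory: a flat pool, independent of the number of categories.
        self.pool = nn.Parameter(torch.randn(pool_size) * 0.01)

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        # Hash each (category id, embedding column) pair to a pool slot on the fly,
        # so no per-category state is ever materialized.
        cols = torch.arange(self.dim, device=ids.device)
        keys = ids.unsqueeze(-1) * self.dim + cols               # (batch, dim)
        slots = (keys * 1_000_003 + 12_345) % self.pool_size     # cheap multiplicative hash
        signs = ((keys * 40_503 + 7) % 2).float() * 2.0 - 1.0    # random +/-1 to decorrelate collisions
        return self.pool[slots] * signs

# A table that would nominally need num_categories * dim floats is served from 1M parameters.
emb = HashedSharedEmbedding(dim=64, pool_size=1_000_000)
vectors = emb(torch.tensor([3, 17, 424_242_424]))                # shape (3, 64)
```

    The compression ratio is simply (num_categories × dim) / pool_size, which is how a 100 GB table can shrink to tens of megabytes, at the cost of collisions the optimizer must work around, consistent with the slower convergence reported above.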

    In defense of parameter sharing for model-compression

    When considering a model architecture, there are several ways to reduce its memory footprint. Historically, popular approaches included selecting smaller architectures and creating sparse networks through pruning. More recently, randomized parameter-sharing (RPS) methods have gained traction for compressing models at the start of training. In this paper, we comprehensively assess the trade-off between memory and accuracy across RPS, pruning techniques, and building smaller models. Our findings demonstrate that RPS, which is both data- and model-agnostic, consistently matches or outperforms smaller models and all moderately informed pruning strategies, such as MAG, SNIP, SYNFLOW, and GRASP, across the entire compression range; this advantage becomes particularly pronounced at higher compression. Notably, even when compared to highly informed pruning techniques like Lottery Ticket Rewinding (LTR), RPS exhibits superior performance in high-compression settings, pointing to an inherent capacity advantage that RPS enjoys over sparse models. Theoretically, we establish RPS as a superior technique to pruning in terms of memory-efficient representation for linear models. This paper argues in favor of a paradigm shift towards RPS-based models. During our rigorous evaluation of RPS, we identified issues in the state-of-the-art RPS technique ROAST, specifically regarding stability (ROAST's sensitivity to initialization hyperparameters, often leading to divergence) and Pareto-continuity (ROAST's inability to recover the accuracy of the original model at zero compression). We provably address both of these issues and refer to the modified RPS, which incorporates our improvements, as STABLE-RPS.
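    As a point of reference for the pruning baselines named above (MAG being the simplest), here is a minimal, illustrative sketch of magnitude pruning at a fixed memory budget; the function and numbers are assumptions for illustration, not the paper's evaluation code. RPS spends the same budget on a dense but shared parameter pool (as in the embedding sketch earlier) instead of on a sparse mask.

```python
import torch

def magnitude_prune(weights: torch.Tensor, keep_fraction: float) -> torch.Tensor:
    """Return a 0/1 mask that keeps the largest-magnitude entries (the MAG baseline)."""
    k = max(1, int(keep_fraction * weights.numel()))
    threshold = weights.abs().flatten().topk(k).values.min()
    return (weights.abs() >= threshold).float()

# At roughly 100x compression only ~1% of the weights survive, so the layer is a
# 1%-dense matrix; a shared dense pool of the same memory budget keeps every logical
# weight "alive", which is the capacity argument made above.
w = torch.randn(512, 512)
mask = magnitude_prune(w, keep_fraction=0.01)
sparse_w = w * mask
```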

    Efficient model compression with Random Operation Access Specific Tile (ROAST) hashing

    Advancements in deep learning are often associated with increasing model sizes, and model size dramatically affects the deployment cost and latency of deep models. For instance, models like BERT cannot be deployed on edge devices and mobiles due to their sheer size, so most advances in deep learning are yet to reach the edge. Model compression has received much-deserved attention in the literature across the natural language processing, vision, and recommendation domains. This paper proposes a model-agnostic, cache-friendly model compression approach: Random Operation Access Specific Tile (ROAST) hashing. ROAST collapses the parameters by clubbing them together through a lightweight mapping. Notably, while clubbing these parameters, ROAST exploits cache hierarchies by aligning the memory access pattern with the parameter access pattern. ROAST is up to ~25× faster to train and ~50× faster to infer than the popular parameter sharing method HashedNet. Additionally, ROAST introduces global weight sharing, which is empirically and theoretically superior to the local weight sharing in HashedNet and may be of independent interest in itself. With ROAST, we present the first compressed BERT, which is 100×–1000× smaller with no quality degradation. These compression levels on a universal architecture like the transformer are promising for the future of SOTA model deployment on resource-constrained devices such as mobiles and edge devices.
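    The cache-friendliness claim above comes from hashing whole contiguous tiles of parameters rather than individual scalars (as HashedNet does). The sketch below is a simplified illustration of that idea, not ROAST's actual mapping: each logical tile gets a hashed starting offset into one globally shared pool, and the elements within a tile are read sequentially.

```python
import torch

def tile_hashed_lookup(pool: torch.Tensor, logical_shape, tile: int = 64) -> torch.Tensor:
    """Recover a logical weight tensor from a shared pool, one contiguous tile at a time.

    Illustrative sketch: hashing tiles instead of single scalars keeps pool reads
    sequential and therefore cache-friendly.
    """
    n = 1
    for s in logical_shape:
        n *= s
    num_tiles = (n + tile - 1) // tile
    chunks = []
    for t in range(num_tiles):
        start = (t * 2_654_435_761 + 97) % (pool.numel() - tile)  # hashed tile offset
        chunks.append(pool[start:start + tile])                   # contiguous, cache-friendly read
    return torch.cat(chunks)[:n].reshape(logical_shape)

# One shared pool can back every layer of the network (global weight sharing):
pool = torch.randn(10_000)
w_fc = tile_hashed_lookup(pool, (256, 256))   # 65,536 logical weights from a 10k-parameter pool (~6.5x compression)
```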

    Novel Adsorption Cycle for High-Efficiency Adsorption Heat Pumps and Chillers: Modeling and Simulation Results

    A novel thermodynamic cycle for adsorption heat pumps and chillers is presented. It offers a significant improvement in internal heat recovery between the adsorption and desorption half cycles. A stratified thermal storage, which allows temperature-based extraction and insertion of storage fluid, is hydraulically coupled with a single adsorber. The benefit is increased efficiency, achieved by reusing the released heat of adsorption for regeneration of the adsorber and by making low driving temperature differences possible. For a second-law analysis of this cycle, a dynamic model is employed. This one-dimensional model captures the transient behavior of the system, the losses caused by driving temperature differences at the heat exchangers, the losses due to mixing within the storage, and the losses to the surroundings. The model is suitable both for analyzing this advanced cycle and for comparison with other cycles.
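    To make the efficiency argument concrete, here is a minimal, illustrative energy-balance sketch (all numbers are assumed placeholders, not results from the paper's simulation): recovering part of the released heat of adsorption reduces the external driving heat needed for desorption, which directly raises the cooling COP.

```python
def cooling_cop(q_evap_kj: float, q_desorption_kj: float, q_recovered_kj: float) -> float:
    """COP = useful cooling / external driving heat, after internal heat recovery."""
    q_drive = q_desorption_kj - q_recovered_kj
    return q_evap_kj / q_drive

baseline = cooling_cop(q_evap_kj=300.0, q_desorption_kj=600.0, q_recovered_kj=0.0)         # 0.50
with_recovery = cooling_cop(q_evap_kj=300.0, q_desorption_kj=600.0, q_recovered_kj=150.0)  # ~0.67
```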

    Use of compost filter berms for sediment trapping: primary focus on water quality and structural stability

    Runoff from road construction and maintenance sites is responsible for erosion and for the deposition of sediments in receiving water bodies. In addition to soil particles from erosion, runoff also transports other pollutants such as rubber, toxic metals, automobile fluids, car exhaust (which settles with the rain), pesticides, fertilizers, and other debris. Compost has been used effectively as a valuable soil amendment to aid plant growth, and berms (mounds) of compost placed at the top or bottom of steep slopes can slow the velocity of water and provide additional protection for receiving waters. However, a downside of applying composted organic material is the potential degradation of runoff water quality: overloading with nitrogen and phosphorus causes eutrophication, which reduces the suitability of waterways for beneficial uses. Field testing of the berms, coupled with laboratory analysis of the test water, provides a basis for assessing the impact of compost berms on runoff water quality. The objective of this study was to evaluate the performance of berms made from various materials, namely dairy manure compost, yard waste compost, and composted biosolids mixed with wood chips in a 50:50 ratio, in terms of runoff water quality as well as sediment removal efficiency. Field tests were performed on the berms to simulate conventional rainfall runoff, and the test water was collected as time-weighted samples and analyzed in the laboratory. Several variables were investigated during this study. The results demonstrated that the effectiveness of this application was hampered by the structural instability of the berms: a 100% failure rate was observed in the berms tested. Optimum performance was observed in the yard waste compost berms, which introduced the least amount of contaminants into the water, although some masking effect may be present due to the berm failures; in fact, the actual sediment removal by the berms could not be determined. The study also showed some evidence of a first-flush effect.

    A study of relation between primary open angle glaucoma and type II diabetes mellitus

    Background: Primary open angle glaucoma (POAG) is characterized by its adult onset, an IOP >21 mmHg at some point in the course of the disease, open angles on gonioscopy, glaucomatous visual field changes, and glaucomatous optic nerve damage. POAG is a multifactorial disease, with risk factors such as age, black race, positive family history, and high myopia. Diabetes mellitus has also been considered one of the risk factors, but no major study has been conducted to provide tangible proof. Methods: This cross-sectional, case-control study was conducted to determine whether diabetes is a risk factor in the development of glaucoma. The selected patients were divided into 3 groups based on inclusion and exclusion criteria and were subjected to complete ocular examination, including gonioscopy and perimetry. Results: 16 of the 50 patients in the diabetic group (28%) were found to have POAG. The p value was <0.005, which was statistically significant. No correlation was found between blood sugar and IOP levels in these patients. Conclusions: These data show a significant correlation between diabetes and glaucoma. Further studies are warranted to determine its actual role in the pathogenesis of glaucoma.

    Program Synthesis using Natural Language

    Interacting with computers is a ubiquitous activity for millions of people. Repetitive or specialized tasks often require the creation of small, often one-off, programs, and end-users struggle to learn and use the myriad domain-specific languages (DSLs) needed to accomplish these tasks effectively. We present a general framework for constructing program synthesizers that take natural language (NL) inputs and produce expressions in a target DSL. The framework takes as input a DSL definition and training data consisting of NL/DSL pairs. From these it constructs a synthesizer by learning optimal weights and classifiers (using NLP features) that rank the outputs of a keyword-programming-based translation. We applied our framework to three domains: repetitive text editing, an intelligent tutoring system, and flight information queries. On 1200+ English descriptions, the respective synthesizers rank the desired program as top-1 for 80% of the descriptions and within the top-3 for 90%.
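    A minimal sketch of the ranking step described above may help: candidate DSL programs produced by a keyword-programming translation are scored by a learned weighting of simple NLP features, and the top-ranked candidates are returned. The feature set, weights, and DSL strings here are illustrative assumptions, not the paper's actual features or domains.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    program: str        # expression in the target DSL
    keywords_hit: int   # how many NL keywords this candidate's derivation covered
    size: int           # number of DSL operators (smaller is usually preferred)

def score(c: Candidate, weights=(1.0, -0.2)) -> float:
    # Linear scoring over the features; the weights here are hand-set for illustration,
    # whereas the framework learns them from NL/DSL training pairs.
    return weights[0] * c.keywords_hit + weights[1] * c.size

def rank(candidates):
    return sorted(candidates, key=score, reverse=True)

candidates = [
    Candidate("ReplaceAll(line, 'foo', 'bar')", keywords_hit=3, size=2),
    Candidate("Remove(line, 'foo')", keywords_hit=2, size=1),
]
best = rank(candidates)[0]   # the top-1 program offered to the end-user
```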